Automatic determination of parts of speech of English words

Author

  • Lois L. Earl
Abstract

The classifying of words according to syntactic usage is basic to language handling; this paper describes an algorithm for automatically classifying words according to thirteen commonly used parts of speech: noun, adjective, verb, past verb, adverb, preposition, conjunction, pronoun, interjection, present participle, past participle, auxiliary verb, and plural or collective noun. The algorithm was derived by a computerized study of the words in The Shorter Oxford English Dictionary. In its operation it utilizes a prepared dictionary of around nine hundred words to assign parts of speech to special or exceptional words. Other words are split into affix and kernel parts and assigned a part of speech on the basis of the part-of-speech implications of the affixes and the length of the remaining kernel. An accuracy of 95 per cent is achieved from the point of view of inclusive part of speech, where inclusive part of speech is defined as that string which contains all the parts of speech attributed to the word by the dictionary but which may also contain one or two more parts of speech.
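The two-stage procedure described in the abstract (exception dictionary first, then affix and kernel analysis) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the entries in `EXCEPTIONS`, the suffixes and minimum kernel lengths in `SUFFIX_RULES`, the `DEFAULT` fallback, and the function names are hypothetical and are not taken from the paper, whose actual tables are not reproduced in this abstract. The `is_inclusive_match` helper follows the abstract's definition of inclusive part of speech: every dictionary tag must be present, with at most two extra tags.

```python
# Hypothetical exception dictionary; the paper's prepared dictionary has
# around nine hundred entries, none of which are listed in the abstract.
EXCEPTIONS = {
    "and": {"conjunction"},
    "of": {"preposition"},
    "ouch": {"interjection"},
}

# Hypothetical suffix rules: (suffix, minimum kernel length, implied tags).
# Longer suffixes come first so they are tried before shorter ones.
SUFFIX_RULES = [
    ("ness", 3, {"noun"}),
    ("ing", 3, {"present participle", "noun", "adjective"}),
    ("ly", 3, {"adverb"}),
    ("ed", 3, {"past verb", "past participle"}),
    ("s", 3, {"plural or collective noun", "verb"}),
]

# Hypothetical fallback when no rule applies.
DEFAULT = {"noun", "verb"}


def classify(word):
    """Return a set of candidate parts of speech for a single word."""
    w = word.lower()
    # Stage 1: special or exceptional words come from the dictionary.
    if w in EXCEPTIONS:
        return set(EXCEPTIONS[w])
    # Stage 2: split into affix and kernel; accept the affix's implication
    # only if the remaining kernel is long enough.
    for suffix, min_kernel, tags in SUFFIX_RULES:
        if w.endswith(suffix):
            kernel = w[: len(w) - len(suffix)]
            if len(kernel) >= min_kernel:
                return set(tags)
    return set(DEFAULT)


def is_inclusive_match(predicted, reference, max_extra=2):
    """True if predicted holds every reference tag plus at most max_extra more."""
    return reference <= predicted and len(predicted - reference) <= max_extra
```

With these illustrative rules, `classify("quickly")` yields the adverb tag, while `classify("fly")` falls through to the default because the kernel left after removing "ly" is only one letter long. That is the role the abstract assigns to kernel length.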

Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Design and Implementation of an Intelligent Part of Speech Generator

The aim of this paper is to report on an attempt to design and implement an intelligent system capable of generating the correct part of speech for a given sentence while the sentence is totally new to the system and not stored in any database available to the system. It follows the same steps a normal individual does to provide the correct parts of speech using a natural language processor. It...

L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors

This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...

Pronunciation modeling of foreign words for Mandarin ASR by considering the effect of language transfer

One of the challenges in automatic speech recognition is the recognition of foreign words. It is observed that a speaker's pronunciation of a foreign word is influenced by knowledge of his native language, a phenomenon known as the effect of language transfer. This paper focuses on examining the phonetic effect of language transfer in automatic speech recognition. A set of lexical rules is prop...

Hidden Markov Models and Dynamic Programming

1 Last week: stochastic part-of-speech tagging. Last week we reviewed parts of speech, which are linguistic categories of words. These categories are defined in terms of syntactic or morphological behaviour. Parts of speech for English traditionally include: nouns, which are concrete or abstract entities; pronouns, which substitute for nouns; adjectives, which modify nouns; verbs, which are actions or states of being; adverbs...

Journal title:
  • Mech. Translat. & Comp. Linguistics

Volume 10  Issue 

Pages  -

Publication date 1967